Topic-based Evaluation for Conversational Bots
Abstract
Dialog evaluation is a challenging problem, especially for non-task-oriented dialogs where conversational success is not well-defined. We propose to evaluate dialog quality using topic-based metrics that describe the ability of a conversational bot to sustain coherent and engaging conversations on a topic, and the diversity of topics that a bot can handle. To detect conversation topics per utterance, we adopt Deep Average Networks (DAN) and train a topic classifier on a variety of question and query data categorized into multiple topics. We propose a novel extension to DAN by adding a topic-word attention table that allows the system to jointly capture topic keywords in an utterance and perform topic classification. We compare our proposed topic-based metrics with the ratings provided by users and show that our metrics both correlate with and complement human judgment. Our analysis is performed on tens of thousands of real human-bot dialogs from the Alexa Prize competition and highlights user expectations for conversational bots.
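As a rough illustration of what topic-based metrics over per-utterance topic labels might look like, the sketch below computes the number of distinct topics in a dialog and the length of runs of consecutive utterances on the same topic. The abstract does not give the exact metric definitions, so the function name `topic_metrics`, the three quantities, and the example labels are assumptions made for illustration only, not the authors' method.

```python
from itertools import groupby
from typing import Dict, List


def topic_metrics(utterance_topics: List[str]) -> Dict[str, float]:
    """Summarize one dialog from its per-utterance topic labels.

    Hypothetical definitions (not taken from the paper):
    - distinct_topics: how many different topics appear in the dialog
    - mean_run_length / max_run_length: how long the dialog stays on
      one topic before switching to another
    """
    if not utterance_topics:
        return {"distinct_topics": 0, "mean_run_length": 0.0, "max_run_length": 0}
    # Lengths of maximal runs of consecutive utterances sharing one topic.
    run_lengths = [len(list(group)) for _, group in groupby(utterance_topics)]
    return {
        "distinct_topics": len(set(utterance_topics)),
        "mean_run_length": sum(run_lengths) / len(run_lengths),
        "max_run_length": max(run_lengths),
    }


# Example labels, assumed to come from some per-utterance topic classifier.
labels = ["Music", "Music", "Music", "Sports", "Politics", "Politics"]
metrics = topic_metrics(labels)
print(metrics)
```

In this sketch the distinct-topic count stands in for topic diversity and the run lengths stand in for the ability to sustain a topic; a real evaluation would take its labels from a trained topic classifier rather than a hand-written list.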
Similar resources
A Conversational Agent Based on a Conceptual Interpretation of a Data Driven Semantic Space
In this work we propose an interpretation of the LSA framework which leads to a data-driven “conceptual” space creation suitable for an “intuitive” conversational agent. The proposed approach allows overcoming the limitations of traditional, rule-based, chat-bots, leading to a more natural dialogue.
Emulating Human Conversations using Convolutional Neural Network-based IR
Conversational agents (“bots”) are beginning to be widely used in conversational interfaces. To design a system that is capable of emulating human-like interactions, a conversational layer that can serve as a fabric for chat-like interaction with the agent is needed. In this paper, we introduce a model that employs Information Retrieval by utilizing convolutional deep structured semantic neural...
Modelling Affordances for the Control and Evaluation of Intrinsically Motivated Robots
In psychological theory, affordances provide a way to describe an environment in terms of the opportunities it provides an organism to act. Affordance-based models have been applied to robotics in areas such as tool-use, interaction and vision, as an alternative to hybrid control architectures. This paper introduces a model of affordances for controlling and evaluating intrinsically motivated r...
Ranking Responses Oriented to Conversational Relevance in Chat-bots
For automatic chatting systems, it is a great challenge to reply to a given query by considering the conversation history, rather than the query alone. This paper proposes a deep neural network to address the context-aware response ranking problem by end-to-end learning, so as to help select conversationally relevant candidates. By combining the multi-column convolutional layer and...
Topic Segmentation and Labeling in Asynchronous Conversations
Topic segmentation and labeling is often considered a prerequisite for higher-level conversation analysis and has been shown to be useful in many Natural Language Processing (NLP) applications. We present two new corpora of email and blog conversations annotated with topics, and evaluate annotator reliability for the segmentation and labeling tasks in these asynchronous conversations. We propos...
Journal: CoRR
Volume: abs/1801.03622
Publication date: 2018